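Flexible octopus arms exhibit an exceptional ability to coordinate large numbers of degrees of freedom and perform complex manipulation tasks. As a result, these systems continue to attract the attention of biologists and roboticists alike. In this paper, we develop a three-dimensional model of a soft octopus arm, equipped with biomechanically realistic muscle actuation. Internal forces and couples exerted by all major muscle groups are considered. An energy shaping control method is described to coordinate muscle activity so as to grasp and reach in 3D space. The key contributions of this paper are: (i) modeling of the major muscle groups to elicit three-dimensional movements; (ii) a mathematical formulation of muscle activations based on a stored energy function; and (iii) a computationally efficient procedure to design task-specific equilibrium configurations, obtained by solving an optimization problem in the special Euclidean group SE(3). Muscle controls are then computed iteratively from the co-state variable arising from the solution of the optimization problem. The approach is demonstrated numerically in the physically accurate software environment Elastica. Results of numerical experiments mimicking observed octopus behaviors are reported.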
translated by 谷歌翻译
This paper describes the 5th edition of the Predicting Video Memorability Task as part of MediaEval2022. This year we have reorganised and simplified the task in order to encourage a greater depth of inquiry. Similar to last year, two datasets are provided in order to facilitate generalisation; however, this year we have replaced the TRECVid2019 Video-to-Text dataset with the VideoMem dataset in order to remedy underlying data quality issues, and to prioritise short-term memorability prediction by elevating the Memento10k dataset as the primary dataset. Additionally, a fully fledged electroencephalography (EEG)-based prediction sub-task is introduced. In this paper, we outline the core facets of the task and its constituent sub-tasks, describing the datasets, evaluation metrics, and requirements for participant submissions.
The Predicting Media Memorability task in the MediaEval evaluation campaign has been running annually since 2018 and several different tasks and data sets have been used in this time. This has allowed us to compare the performance of many memorability prediction techniques on the same data and in a reproducible way and to refine and improve on those techniques. The resources created to compute media memorability are now being used by researchers well beyond the actual evaluation campaign. In this paper we present a summary of the task, including the collective lessons we have learned for the research community.
Current state-of-the-art approaches to text classification typically leverage BERT-style Transformer models with a softmax classifier, jointly fine-tuned to predict class labels of a target task. In this paper, we instead propose an alternative training objective in which we learn task-specific embeddings of text: our proposed objective learns embeddings such that all texts that share the same target class label should be close together in the embedding space, while all others should be far apart. This allows us to replace the softmax classifier with a more interpretable k-nearest-neighbor classification approach. In a series of experiments, we show that this yields a number of interesting benefits: (1) The resulting order induced by distances in the embedding space can be used to directly explain classification decisions. (2) This facilitates qualitative inspection of the training data, helping us to better understand the problem space and identify labelling quality issues. (3) The learned distances to some degree generalize to unseen classes, allowing us to incrementally add new classes without retraining the model. We present extensive experiments which show that the benefits of ante-hoc explainability and incremental learning come at no cost in overall classification accuracy, thus pointing to practical applicability of our proposed approach.
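The k-nearest-neighbor step described in this abstract can be illustrated with a minimal sketch. It assumes the texts have already been mapped to embeddings by some fine-tuned encoder (not shown); the toy 2-D vectors, the labels, the cosine distance, and the function names `cosine_distance` and `knn_classify` are all hypothetical choices for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_classify(query, train_embeddings, train_labels, k=3):
    """Majority vote over the k nearest training embeddings.

    Returns the predicted label and the neighbours used, so the
    neighbours can serve as an explanation of the decision.
    """
    ranked = sorted(
        zip(train_embeddings, train_labels),
        key=lambda pair: cosine_distance(query, pair[0]),
    )
    neighbours = ranked[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0], neighbours


# Toy embeddings (assumed data): two classes clustered along different axes.
train_embeddings = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9)]
train_labels = ["sports", "sports", "politics", "politics"]

label, evidence = knn_classify((0.8, 0.3), train_embeddings, train_labels, k=3)

# Adding a new class needs no retraining in this sketch: append its examples.
train_embeddings.append((-1.0, 0.0))
train_labels.append("weather")
new_label, _ = knn_classify((-0.9, 0.1), train_embeddings, train_labels, k=1)
```

The returned `evidence` list is what makes the decision inspectable: each prediction comes with the labelled training examples that produced it.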
Autonomous driving requires efficient reasoning about the location and appearance of the different agents in the scene, which aids in downstream tasks such as object detection, object tracking, and path planning. The past few years have witnessed a surge in approaches that combine the different task-based modules of the classic self-driving stack into an End-to-End (E2E) trainable learning system. These approaches replace perception, prediction, and sensor fusion modules with a single contiguous module with a shared latent space embedding, from which one extracts a human-interpretable representation of the scene. One of the most popular representations is the Bird's-eye View (BEV), which expresses the location of different traffic participants in the ego vehicle frame from a top-down view. However, a BEV does not capture the chromatic appearance information of the participants. To overcome this limitation, we propose a novel representation that captures the appearance and occupancy information of various traffic participants from an array of monocular cameras covering a 360-degree field of view (FOV). We use a learned image embedding of all camera images to generate a BEV of the scene at any instant that captures both appearance and occupancy of the scene, which can aid in downstream tasks such as object tracking and executing language-based commands. We test the efficacy of our approach on a synthetic dataset generated from CARLA. The code, data set, and results can be found at https://rebrand.ly/APP OCC-results.
In this research work, we have demonstrated the application of Mask-RCNN (Regional Convolutional Neural Network), a deep-learning algorithm for computer vision and specifically object detection, to the semiconductor defect inspection domain. Stochastic defect detection and classification during semiconductor manufacturing has grown to be a challenging task as we continuously shrink circuit pattern dimensions (e.g., for pitches less than 32 nm). Defect inspection and analysis by state-of-the-art optical and e-beam inspection tools is generally driven by rule-based techniques, which often leads to misclassification and thereby necessitates human expert intervention. In this work, we have revisited and extended our previous deep learning-based defect classification and detection method towards improved defect instance segmentation in SEM images, capturing the precise extent of each defect and generating a mask for each defect category/instance. This also enables us to extract and calibrate each segmented mask and quantify the pixels that make up each mask, which in turn enables us to count the instances of each defect category and to calculate their surface area in terms of pixels. We aim to detect and segment different types of inter-class stochastic defect patterns such as bridge, break, and line collapse, as well as to differentiate accurately between intra-class multi-categorical defect bridge scenarios (such as thin/single/multi-line/horizontal/non-horizontal) for aggressive pitches as well as thin resists (High NA applications). Our proposed approach demonstrates its effectiveness both quantitatively and qualitatively.
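The counting and area step mentioned in this abstract (instances per defect category, surface area in pixels) can be sketched as follows. This is a minimal illustration that assumes the segmentation model has already produced one binary mask per detected instance; the nested-list mask format, the toy masks, and the function name `summarize_masks` are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def summarize_masks(instances):
    """Count instances and total mask area (in pixels) per defect category.

    `instances` is a list of (category, mask) pairs, where each mask is a
    2-D list of 0/1 values marking the pixels that belong to the defect.
    """
    counts = defaultdict(int)
    areas = defaultdict(int)
    for category, mask in instances:
        counts[category] += 1
        areas[category] += sum(1 for row in mask for px in row if px)
    return dict(counts), dict(areas)


# Toy predictions (assumed data): two bridge defects and one break defect.
predictions = [
    ("bridge", [[0, 1, 1], [0, 1, 0]]),
    ("bridge", [[1, 1], [0, 0]]),
    ("break", [[1, 1], [1, 1]]),
]

counts, areas = summarize_masks(predictions)
```

Converting pixel areas to physical units would additionally require the image's pixel size, which the abstract does not state.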
Recent mean field interpretations of learning dynamics in over-parameterized neural networks offer theoretical insights on the empirical success of first order optimization algorithms in finding global minima of the nonconvex risk landscape. In this paper, we explore applying mean field learning dynamics as a computational algorithm, rather than as an analytical tool. Specifically, we design a Sinkhorn regularized proximal algorithm to approximate the distributional flow from the learning dynamics in the mean field regime over weighted point clouds. In this setting, a contractive fixed point recursion computes the time-varying weights, numerically realizing the interacting Wasserstein gradient flow of the parameter distribution supported over the neuronal ensemble. An appealing aspect of the proposed algorithm is that the measure-valued recursions allow meshless computation. We demonstrate the proposed computational framework of interacting weighted particle evolution on binary and multi-class classification. Our algorithm performs gradient descent of the free energy associated with the risk functional.
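For readers unfamiliar with the Sinkhorn building block named in this abstract, the sketch below shows the generic Sinkhorn fixed-point scaling for entropy-regularized optimal transport between two weighted point clouds. It is not the authors' proximal recursion for the Wasserstein gradient flow; the point locations, weights, squared-distance cost, regularization value `eps`, and the function name `sinkhorn` are all assumptions chosen for illustration.

```python
import math


def sinkhorn(mu, nu, cost, eps=0.5, iters=500):
    """Entropy-regularized transport plan between weight vectors mu and nu.

    Alternately rescales the rows and columns of the Gibbs kernel
    K = exp(-cost / eps) until the plan's marginals match mu and nu.
    """
    n, m = len(mu), len(nu)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Row scaling: enforce the first marginal.
        u = [mu[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Column scaling: enforce the second marginal.
        v = [nu[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Two small weighted point clouds on the real line (assumed data).
xs, mu = [0.0, 1.0, 2.0], [0.2, 0.5, 0.3]
ys, nu = [0.5, 1.5], [0.6, 0.4]
cost = [[(x - y) ** 2 for y in ys] for x in xs]

plan = sinkhorn(mu, nu, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(xs))) for j in range(len(ys))]
```

Because the iteration only touches the weights attached to the given points, no spatial grid is needed, which is consistent with the meshless computation the abstract highlights.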
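Dynamic magnetic resonance imaging (MRI) is a popular medical imaging technique that generates image sequences of the flow of a contrast material inside tissues and organs. However, its application to imaging bolus movement through the esophagus has been demonstrated in only a few feasibility studies and remains relatively unexplored. In this work, we present a computational framework called mechanics-informed MRI (MRI-MECH) that enhances that capability, thereby increasing the applicability of dynamic MRI for diagnosing esophageal disorders. Pineapple juice was used as the swallowed contrast material for the dynamic MRI, and the MRI image sequence was used as input to MRI-MECH. MRI-MECH modeled the esophagus as a flexible one-dimensional tube whose elastic walls followed a linear tube law. Flow through the esophagus was then governed by one-dimensional mass and momentum conservation equations. These equations were solved using a physics-informed neural network (PINN). The PINN minimized the difference between the MRI measurements and the model predictions, ensuring that the physics of the fluid flow problem was always followed. MRI-MECH calculated the fluid velocity and pressure during esophageal transit and estimated the mechanical health of the esophagus by calculating wall stiffness and active relaxation. Additionally, MRI-MECH predicted missing information about the lower esophageal sphincter during the emptying process, demonstrating its applicability to scenarios with missing data or poor image resolution. In addition to potentially improving clinical decisions based on quantitative estimates of the mechanical health of the esophagus, MRI-MECH can also be adapted to other medical imaging modalities to enhance their functionality.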
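As a reading aid, the governing equations named in this abstract can be written in a generic textbook form for one-dimensional flow in a compliant tube. The notation below (cross-sectional area $A$, velocity $u$, pressure $p$, density $\rho$, stiffness $K$, reference area $A_0$, reference pressure $p_0$, and a lumped friction term $f$) is assumed for illustration; the abstract does not give the authors' exact formulation, which may include additional terms such as an activation factor for wall relaxation.

```latex
% Generic 1D compliant-tube model (assumed notation, not the authors' exact form)
\begin{align}
  \frac{\partial A}{\partial t} + \frac{\partial (A u)}{\partial x} &= 0
    && \text{(mass conservation)} \\
  \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x}
    &= -\frac{1}{\rho} \frac{\partial p}{\partial x} + f
    && \text{(momentum conservation)} \\
  p &= p_0 + K \left( \frac{A}{A_0} - 1 \right)
    && \text{(linear tube law)}
\end{align}
```

In a PINN setting, the residuals of such equations would typically enter the training loss alongside the mismatch between predicted and MRI-measured cross-sectional areas.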
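The Resource Description Framework (RDF) and Property Graph (PG) are the two most commonly used data models for representing, storing, and querying graph data. We present Expressive Reasoning Graph Store (ERGS), a graph store built on top of JanusGraph (a Property Graph store) that also allows storing and querying RDF datasets. First, we describe how RDF data is converted into a Property Graph representation, and then describe a query translation module that converts SPARQL queries into a series of Gremlin traversals. The converter and translator thus developed can allow any Apache TinkerPop compliant graph database to store and query RDF datasets. We demonstrate the effectiveness of the proposed approach using JanusGraph as the base Property Graph store and compare its performance with standard RDF systems.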
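One common way to map RDF triples onto a property graph is sketched below: triples with literal objects become vertex properties, and triples with resource objects become labelled edges. This mapping is an assumption for illustration, since the abstract does not specify the ERGS conversion rules; the in-memory dictionaries, the `is_literal` flag, and the function name `rdf_to_property_graph` are hypothetical and do not use the JanusGraph or TinkerPop APIs.

```python
def rdf_to_property_graph(triples):
    """Convert (subject, predicate, object, is_literal) tuples to a toy property graph.

    Returns (vertices, edges): vertices maps a resource id to its properties,
    edges is a list of (source, label, target) tuples.
    """
    vertices = {}
    edges = []
    for subject, predicate, obj, is_literal in triples:
        vertices.setdefault(subject, {})
        if is_literal:
            # Literal object: store as a (multi-valued) property on the subject.
            vertices[subject].setdefault(predicate, []).append(obj)
        else:
            # Resource object: create the target vertex and a labelled edge.
            vertices.setdefault(obj, {})
            edges.append((subject, predicate, obj))
    return vertices, edges


# Toy RDF data (assumed), using prefixed names for brevity.
triples = [
    ("ex:alice", "foaf:name", "Alice", True),
    ("ex:alice", "foaf:knows", "ex:bob", False),
    ("ex:bob", "foaf:name", "Bob", True),
]

vertices, edges = rdf_to_property_graph(triples)
```

Under such a mapping, a SPARQL pattern like `?a foaf:knows ?b` corresponds to following `foaf:knows` edges between vertices, which is the kind of step a Gremlin traversal expresses.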
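We propose formulating the finite-horizon stochastic optimal control problem for colloidal self-assembly in the space of probability density functions (PDFs) of the underlying state variables (namely, order parameters). The control objective is formulated in terms of steering the state PDFs from a prescribed initial probability measure towards a prescribed terminal probability measure with minimum control effort. For specificity, we use a univariate stochastic state model from the literature. Both the analysis and the computational steps for control synthesis developed in this paper generalize to multivariate stochastic state dynamics given by models that are generic nonlinear in the state and non-affine in the control. We derive the conditions of optimality for the associated optimal control problem. This derivation yields a system of three coupled partial differential equations together with boundary conditions at the initial and terminal times. The resulting system is a generalized instance of the so-called Schrödinger bridge problem. We then determine the optimal control policy by training a physics-informed deep neural network, where the "physics" are the derived conditions of optimality. The performance of the proposed solution is demonstrated via numerical simulations on a benchmark colloidal self-assembly problem.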